Analysis and Visualization of Classifier Performance with Nonuniform Class and Cost Distributions

Authors

  • Foster Provost
  • Tom Fawcett
Abstract

Applications of machine learning have shown repeatedly that the standard assumptions of uniform class distribution and uniform misclassification costs rarely hold. Little is known about how to select classifiers when error costs and class distributions are not known precisely at training time, or when they can change. We present a method for analyzing and visualizing the performance of classification methods that is robust to changing distributions and allows a sensitivity analysis if a range of costs is known. The method combines techniques from ROC analysis, decision analysis and computational geometry, and adapts them to the particulars of analyzing learned classifiers. We then demonstrate analysis and visualization properties of the method.
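The abstract does not spell out the procedure, so the following is only a minimal sketch of one common way to combine ROC analysis, expected-cost reasoning and a convex hull computation. It assumes each classifier is summarized by a single (false positive rate, true positive rate) point; the function names `roc_convex_hull`, `expected_cost` and `best_operating_point`, and the sample points, are hypothetical and not taken from the paper.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (false positive rate, true positive rate)


def roc_convex_hull(points: List[Point]) -> List[Point]:
    """Upper convex hull of ROC points, with the trivial classifiers
    (0, 0) "always negative" and (1, 1) "always positive" added."""
    pts = sorted(set(points) | {(0.0, 0.0), (1.0, 1.0)})
    hull: List[Point] = []
    for p in pts:
        # Drop the last hull point while it lies on or below the
        # segment joining its predecessor to the new point p.
        while len(hull) >= 2:
            (x1, y1), (x2, y2) = hull[-2], hull[-1]
            cross = (x2 - x1) * (p[1] - y1) - (y2 - y1) * (p[0] - x1)
            if cross >= 0:
                hull.pop()
            else:
                break
        hull.append(p)
    return hull


def expected_cost(pt: Point, p_pos: float, c_fn: float, c_fp: float) -> float:
    """Expected misclassification cost per example at a ROC point.
    p_pos is the positive-class prior; c_fn and c_fp are the assumed
    costs of a false negative and a false positive."""
    fpr, tpr = pt
    return p_pos * (1.0 - tpr) * c_fn + (1.0 - p_pos) * fpr * c_fp


def best_operating_point(points: List[Point], p_pos: float,
                         c_fn: float, c_fp: float) -> Point:
    """Hull vertex with the lowest expected cost for the given conditions."""
    hull = roc_convex_hull(points)
    return min(hull, key=lambda pt: expected_cost(pt, p_pos, c_fn, c_fp))


# Hypothetical classifiers, for illustration only.
classifiers = [(0.1, 0.5), (0.3, 0.8), (0.4, 0.6), (0.6, 0.95)]
hull = roc_convex_hull(classifiers)
balanced = best_operating_point(classifiers, p_pos=0.5, c_fn=1.0, c_fp=1.0)
skewed = best_operating_point(classifiers, p_pos=0.01, c_fn=1.0, c_fp=1.0)
```

In this sketch a point that falls below the hull, such as (0.4, 0.6), is never the cheapest choice for any prior or cost ratio, and re-running `best_operating_point` over a range of `p_pos`, `c_fn` and `c_fp` values gives a simple sensitivity analysis.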

Similar resources

Analysis and Visualization of Classifier Performance: Comparison under Imprecise Class and Cost Distributions

Applications of inductive learning algorithms to real-world data mining problems have shown repeatedly that using accuracy to compare classifiers is not adequate because the underlying assumptions rarely hold. We present a method for the comparison of classifier performance that is robust to imprecise class distributions and misclassification costs. The ROC convex hull method combines techniques...

Cost-Sensitive Classifier Evaluation Using Cost Curves

The evaluation of classifier performance in a cost-sensitive setting is straightforward if the operating conditions (misclassification costs and class distributions) are fixed and known. When this is not the case, evaluation requires a method of visualizing classifier performance across the full range of possible operating conditions. This talk outlines the most important requirements for cost-...

A Novel Scalable Multi-class ROC for Effective Visualization and Computation

This paper introduces a new cost function for evaluating multi-class classifiers. The new cost function provides both a way to visualize the performance (expected cost) of a multi-class classifier and a summary of the misclassification costs. This function overcomes the limitation that ROC cannot represent classifier performance graphically when there are more than two ...

Improved nonparametric discriminant analysis for hyperspectral image classification with limited training samples

Feature extraction plays an important role in improving hyperspectral image classification. Compared with parametric methods, nonparametric feature extraction methods perform better when classes are not normally distributed. In addition, these methods can extract more features than parametric feature extraction methods can. Nonparametric feature extraction methods use nonparametric s...

Fault diagnosis in a distillation column using a support vector machine based classifier

Fault diagnosis has always been an essential aspect of control system design. It is necessary because of the growing demand for increased performance and safety of industrial systems. The support vector machine classifier is a technique based on statistical learning theory and is designed to reduce structural bias. Support vector machine classification in many applications in v...

Publication date: 2002